By the end of this hands-on tutorial you should be familiar with:
The minimum knowledge to benefit from the tutorial includes being familiar with:
You are leading a research project involving a systematic review of multiple interventions for a specific health condition and patient population. You have put in a lot of effort to establish a high-standards protocol, registered it with PROSPERO register and completed the data extraction.
While reviewing the extracted data, you notice that many eligible trials report having missing participants. How are you going to proceed with the analysis?
Consider the small fictional trial in Figure 1, where 20 participants have been randomized into two interventions and the outcome is binary (symptoms improved or worsened after intervention). Some participants completed the trial (completers), while others left the trial prematurely for several reasons (missing). The goal is to draw conclusions using the entire randomized sample.
Figure 1: A fictional small trial of two interventions
Without individual participant data, there are only a few methods to handle missing participants in (network) meta-analysis:
Both exclusion and imputation are suboptimal methods in handling missing participants. Exclusion, as shown in Figure 2, is not appropriate as it reduces the sample to those participants who remained in the trial and decreases the power to detect significant associations. Moreover, if the missingness mechanism is not M(C)AR, the risk of estimating a biased intervention effect (e.g., odds ratio) is imminent.
Figure 2: Excluding missing participants from both interventions
Figure 3 illustrates the imputation of missing participants in two interventions. In the first intervention, it is assumed that all missing participants experienced symptom improvement, and thus, left the trial prematurely. In the second intervention, it is assumed that all missing participants experienced symptom deterioration and left the trial prematurely. Imputation implies adding the number of missing participants to the outcome based on the corresponding scenario: to the number of events (improvement) in the first intervention and to the number of non-events (deterioration) in the second intervention. This method preserves the randomized sample but takes the missingness scenario for granted. Without following the missing participants to record their outcome, we cannot know the true missingness mechanism. Hence, any assumptions we make about the missingness mechanism should naturally propagate in the estimation of the intervention effect by increasing its standard error. Imputation rears its ugly head by ‘treating’ the missing participants as observed, increasing the risk to estimate a biased intervention effect and draw erroneous statistical significance conclusions.
Figure 3: Imputing missing participants making different scenarios in each intervention
(Network) meta-analysis inherits the limitations of exclusion and imputation observed at trial-level via the inclusion of trials with missing participants. As you do not have individual participant data to make use of different cool methods for missing participants, the quality of the (network) meta-analysis results will strongly depend, among others things, on your good guess about the missingness mechanism and how you incorporate this good guess into the results. We have good news on that; just, keep reading.
The last member of the squad of methods for missing participants in (network) meta-analysis is the pattern-mixture model. The pattern-mixture model elegantly adjusts, rather than fixes, the average outcome (in this case, the risk of event) to the assumed missingness mechanism, and propagates this assumption to the standard error of the estimated intervention effect, while maintaining the randomized sample. It allows for different assumptions about the missing mechanism between interventions or across the trials, offering a flexible framework for properly handling missing participants in (network) meta-analysis. If exclusion, imputation and pattern-mixture model were characters from the epic Western spaghetti film, The Good, the Bad and the Ugly, the correspondence of who would be who is rather clear.
Figure 4 shows two normal distributions of
the informative missingness odds ratio (IMOR), which is
a central parameter in the pattern-mixture model for a binary outcome.
IMOR is similar to the odds ratio for two interventions, as it compares
the odds of an event in missing participants with the odds of an event
in completers. An IMOR equal to 1 indicates the MAR mechanism (i.e., no
difference in the compared groups), and an IMOR different from 1 implies
the MNAR mechanism (i.e., the compared groups differ). To illustrate
this, we have considered a different IMOR distribution for each
intervention. For the the first intervention, we assumed that, on
average, the odds of an event
in completers were twice that in missing participants (the mode), with
other scenarios being less likely to occur as we deviate from the mode.
For the second intervention, we assumed that, on average, the odds of an
event in missing participants were twice that in completers (the mode),
again, with other scenarios being less likely to occur as we deviate
from the mode. It is important to note that the normal distributions
presented in Figure 4 are implied for the IMOR
parameter in the logarithmic scale, which is commonly used when
estimating the odds ratio.
Figure 4: The IMOR parameter under different scenarios for each intervention
The rnmamod R package was originally developed to address the lack of dedicated functions for handling missing participants in pairwise and network meta-analysis in existing R packages. Over time, our goal expanded to create a comprehensive suite of functions for performing and visualizing Bayesian pairwise and network meta-analysis. The package implements the pattern-mixture model for proper handling of missing participants in all models. In this tutorial, we will walk you through the core functions of the package and provide a comprehensive visual summary of the results.
You can install and load the package directly from CRAN by running the code below:
install.packages("rnmamod")
library(rnmamod)
However, we recommend installing the development version to experience the latest advances in the package:
install.packages("devtools")
devtools::install_github("LoukiaSpin/rnmamod")
We will use the dataset of Baker et
al. (2009), which includes 21 trials comparing seven pharmacological
interventions and placebo in chronic obstructive pulmonary disease
(COPD) patients. The aim is to determine if patients experienced COPD
exacerbation after receiving the randomized intervention. Run
head(nma.baker2009) to see the first six trials, or
nma.baker2009 to see the entire dataset. The dataset has
one-trial-per-row format, which is the typical format encountered in
BUGS language models.
head(nma.baker2009)
#> study t1 t2 t3 t4 r1 r2 r3 r4 m1 m2 m3 m4 n1 n2 n3 n4
#> Llewellyn-Jones, 1996 1 4 NA NA 3 0 NA NA 1 0 NA NA 8 8 NA NA
#> Paggiaro, 1998 1 4 NA NA 51 45 NA NA 27 19 NA NA 139 142 NA NA
#> Mahler, 1999 1 7 NA NA 47 28 NA NA 23 9 NA NA 143 135 NA NA
#> Casaburi, 2000 1 8 NA NA 41 45 NA NA 18 12 NA NA 191 279 NA NA
#> van Noord, 2000 1 7 NA NA 18 11 NA NA 8 7 NA NA 50 47 NA NA
#> Rennard, 2001 1 7 NA NA 41 38 NA NA 29 22 NA NA 135 132 NA NA
To create a network plot use the netplot function (type
?netplot to learn more):
# The names of the interventions are in the order they appear in the dataset
interv_names <- c("placebo", "budesonide", "budesonide plus formoterol", "fluticasone", "fluticasone plus salmeterol", "formoterol", "salmeterol", "tiotropium")
# Create the network plot and tables
netplot(data = nma.baker2009,
drug_names = interv_names,
text.cex = 1.5)
The function currently calls the
nma.networkplot function
from the pcnetmeta R
package. The netplot function gives further insights into
the network evidence by printing the following three tables on the
console:
* A summary of the trials, randomized sample, and outcome data
for each observed comparison in the network
The nma.baker2009 dataset was selected due to its high
number of missing participants in most trials, interventions and
comparisons. Use the heatmap_missing_dataset function to
view the distribution of missing participants (in percent) in the
dataset. Most trials have interventions with high level of missingness,
with only a few trials having zero or low missingness rate.
# Abbreviated interventions (in the order they appear in the dataset)
abbr_interv_names <- c("placebo", "BUDE", "BUDE plus FORM", "FLUT", "FLUT plus SALM", "FORM", "SALM", "TIOT")
# Create the heatmap of % missing participants for each trial and arm
heatmap_missing_dataset(data = nma.baker2009,
trial_names = nma.baker2009$study,
drug_names = abbr_interv_names)
Use the heatmap_missing_network function to view the
median and range of missing participants (in percent) for each
intervention (on the main diagonal) and comparison (lower
off-diagonal):
# Abbreviated interventions (in the order they appear in the dataset)
abbr_interv_names <- c("placebo", "BUDE", "BUDE plus FORM", "FLUT", "FLUT plus SALM", "FORM", "SALM", "TIOT")
# Create the heatmap of % missing participants for each intervention and observed comparison
heatmap_missing_network(data = nma.baker2009,
drug_names = abbr_interv_names)
We can immediately spot the comparison with the lowest median of missing participants: tiotropium versus formoterol.
This section introduces you to the backbone function of the
rnmamod
architecture: the run_model. This function conducts
Bayesian network meta-analysis while accounting for
missing participants using the pattern-mixture model. Most functions in
the package cannot work without the run_model function and,
on its own, run_model returns results for a bulk of model
parameters. The magic happens once the run_model function
is fed into the other functions of the package, which enable you to run
all necessary analyses and obtain various graphs to understand and
interpret the results.
Let’s run a Bayesian random-effects (model = "RE")
network meta-analysis while addressing missing participants reported in
each intervention arm of every trial. The effect measure of interest is
the odds ratio (measure = "OR"). A half-normal prior
distribution is used for the between-trial standard deviation with scale
parameter equal to 1
(heter_prior = list("halfnormal", 0, 1)) and one IMOR
parameter is estimated for each intervention in the network
(assumption = "IDE-ARM"). The model operates under the
MAR assumption on average (mean_misspar = c(0, 0))
with a variance equal to 1 (var_misspar = 1) for the
outcome being harmful (D = 0). Assign a name to the
function to use it as an object in the other functions of the package.
The run_model function and all other model functions run in
JAGS.
## Run a random-effects network meta-analysis
res <- run_model(data = nma.baker2009,
measure = "OR",
model = "RE",
assumption = "IDE-ARM",
heter_prior = list("halfnormal", 0, 1),
mean_misspar = c(0, 0),
var_misspar = 1,
D = 0,
ref = 1,
n_chains = 3,
n_iter = 10000,
n_burnin = 1000,
n_thin = 1)
Specifying carefully the arguments of the
run_modelfunction is essential for the subsequent analyses to yield sensible results. This requires thorough consideration of the proper effect measure (measure), the meta-analysis model (model), the structure of the missingness parameter (assumption), the prior distribution for the heterogeneity parameter (heter_prior), the missingness mechanism (mean_missparandvar_misspar), the direction of the outcome (D), and the necessary tuning parameters to run Markov chain Monte Carlo simulations using JAGS (n_chains,n_iter,n_burning, andn_thin). Refer to the documentation of the function by typing?run_model.
Now we can use res in the arguments of the other model
functions to conduct the subsequent analyses, and visualize the results.
The journey has just begun.
Use the mcmc_diagnostic function, to check the model’s
convergence.
# Display diagnostics for Bayesian methods
mcmc_diagnostics(net = res,
par = c("EM", "tau"))
Here, we illustrate a panel of bar plots on the log odds ratio for all possible comparisons in the network (x-axis):
It displays diagnostic plots (trace, density, and autocorrelation)
for the monitored parameters specified in the par argument
(here, the log odds ratio for all possible pairwise comparisons,
"EM", and the common between-trial standard deviation,
"tau"). Access all monitored parameters from
run_model using res$jagsfit. The HTML file is
created thanks to the mcmcplot function of the mcmcplots R
package.
Use the league_heatmap function to create the popular
league table:
# The names of the interventions are in the order they appear in the dataset
interv_names <- c("placebo", "budesonide", "budesonide plus formoterol", "fluticasone", "fluticasone plus salmeterol", "formoterol", "salmeterol", "tiotropium")
# The heatmap league table for estimation
league_heatmap(full1 = res,
drug_names1 = interv_names)
The table is read as row versus column and shows the (posterior
median) odds ratio and 95% credible interval for each comparison. The
interventions are sorted in decreasing order by their posterior mean
SUCRA value shown in the main diagonal. The larger the treatment effect,
the darker the color shade. The league_heatmap function can
also display the treatment effects from two outcomes or a selected
subset of interventions (ideal for huge networks).
The league_heatmap_pred function has the same
arguments as league_heatmap, but it displays
predictions for all possible pairwise comparisons in the network:
# The names of the interventions are in the order they appear in the dataset
interv_names <- c("placebo", "budesonide", "budesonide plus formoterol", "fluticasone", "fluticasone plus salmeterol", "formoterol", "salmeterol", "tiotropium")
# The heatmap league table for prediction
league_heatmap_pred(full1 = res,
drug_names1 = interv_names)
A compact alternative to the league table is to create a forest plot of comparisons with a selected comparator intervention. For illustration, we have selected budesidone as the reference and we ran the following code to obtain the forest plot where results on estimation and prediction co-appear:
# Abbreviated interventions (in the order they appear in the dataset)
abbr_interv_names <- c("placebo", "BUDE", "BUDE+FORM", "FLUT", "FLUT+SALM", "FORM", "SALM", "TIOT")
# Forest plot for estimation & prediction
forestplot(full = res,
compar = "BUDE",
drug_names = abbr_interv_names)
In plot A), the 95% credible intervals (black lines)
overlay the 95% prediction intervals (orange lines). In both plots, the
interventions are sorted from best to worst based on their SUCRA value
(see plot B)).
The rankosucra_plot function creates a plot for each
intervention that combines two popular hierarchy measures: the rankogram
and SUCRA plot:
# The names of the interventions are in the order they appear in the dataset
interv_names <- c("placebo", "budesonide", "budesonide plus formoterol", "fluticasone", "fluticasone plus salmeterol", "formoterol", "salmeterol", "tiotropium")
# The rankogram-SUCRA-plot 'amalgamation'
rankosucra_plot(full1 = res,
drug_names1 = interv_names)
The interventions are sorted from best to worst based on their SUCRA
value (the number on the top left corner of each plot). The
rankosucra_plot function can also display the hierarchy
results from two outcomes; type ?rankosucra_plot for more
details.
The functions run_series_meta and
series_meta_plot are designed to explore the implications
of assuming consistency and common heterogeneity, as opposed to
performing pairwise meta-analysis separately for each observed
comparison in the network.
First, use the run_series_meta function to conduct
separate pairwise meta-analyses for all observed comparisons in the
network:
# Run separate pairwise meta-analyses
meta <- run_series_meta(full = res)
Here, the function inherits the arguments n_chains,
n_iter, n_burning, and n_thin
from the run_model function via res; however,
you can also specify these arguments anew.
Next, we insert meta into the function
series_meta_plot to experience the visualization toolkit
for this analysis:
# Abbreviated interventions (in the order they appear in the dataset)
abbr_interv_names <- c("placebo", "BUDE", "BUDE+FORM", "FLUT", "FLUT+SALM", "FORM", "SALM", "TIOT")
# End-user-ready results for a series of pairwise meta-analyses
series_meta_plot(full = res,
meta = meta,
drug_names = abbr_interv_names)
This function returns the same results in two formats: (i) a table that is printed on the console, and (ii) a panel of forest plots. Below, we present the plots.
Network meta-analysis produced more precise results than pairwise meta-analysis for all observed comparisons (plot A)). The benefits of conducting network meta-analysis for this example are also evident for the heterogeneity parameter, which was estimated at a reasonable level and with greater precision than in any pairwise meta-analysis (plot B)): blue vertical lines refer to the posterior median and 95% credible interval of the common heterogeneity parameter in network meta-analysis.
You can export the table to an xlsx file in your working
directory by adding the argument save_xls = TRUE in the
series_meta_plot function. This argument is also found in
the following functions that summarize and present the model results:
nodesplit_plot, ume_plot, and
metareg_plot.
Note that comparing network meta-analysis with pairwise meta-analysis results should never be used to draw conclusions about the possibility of consistency assumption! There are specific tools designed to evaluate the consistency assumption. In the next part of the tutorial, we will introduce you to functions that offer local and global assessment of the consistency assumption.
Do you see at least one closed-loop of interventions in your network? Are these loops informed by two-arm trials only, or a combination of two-arm and multi-arm trials? If so, you must evaluate whether network meta-analysis produces sensible results under the consistency assumption or if there are loops with evidence of possible inconsistency. Our COPD network has several closed loops (see the network plot above).
The run_nodesplit function applies the node-splitting
approach with the synergy of the R package gemtc to select the
proper nodes to split (read van Valkenhoef and
colleagues for more details on which nodes to split when
multi-arm trials are also present in the loops):
# Run the node-splitting approach
node <- run_nodesplit(full = res)
Just as with the run_series_meta function,
run_nodesplit can inherit the MCMC-related arguments from
the run_model function via res, or we can
specify these arguments anew.
Next, we insert node into the function
nodesplit_plot to get the necessary results both in
tabular and graphical format:
# Abbreviated interventions (in the order they appear in the dataset)
abbr_interv_names <- c("placebo", "BUDE", "BUDE+FORM", "FLUT", "FLUT+SALM", "FORM", "SALM", "TIOT")
# End-user-ready results for the consistency evaluation locally
nodesplit_plot(full = res,
node = node,
drug_names = abbr_interv_names)
The first graph is a panel of interval plots to check consistency through the inconsistency factor (the difference between direct and indirect estimate):
For each split node, we can see the estimated direct effect (from pairwise meta-analysis), the indirect effect (after removing the corresponding comparison from the network), and the inconsistency factor (IF). The deviance information criterion for each model appear on the top left of each plot. The plots have been sorted in ascending order of the deviance information criterion. The 95% credible interval for the inconsistency factor crosses the vertical line (of consistency) in all plots. One may conclude that the consistency assumption holds; however, the point estimate is not even close to zero in all plots. Thinking hat on the head.
The next graph is a ‘reverse’ forest plot on the between-trial standard deviation after each split node:
Again, the split nodes have been sorted in ascending order of the deviance information criterion. The blue horizontal lines refer to the posterior median and 95% credible interval of the common heterogeneity parameter in network meta-analysis.
Network meta-regression is another powerful tool to evaluate the
assumptions of network meta-analysis. We will investigate whether the
intervention effects change over time. We created the vector
cov with the publication year of the trials in the order
they appear in the dataset nma.baker2009.
Run the following code to conduct network meta-regression using the
function run_metareg:
# The year of publication
cov <- c(1996, 1998, 1999, rep(2000, 2), 2001, rep(2002, 5), rep(2003, 2), rep(2005, 4), rep(c(2006, 2007), each = 2))
# Run network meta-regression
reg <- run_metareg(full = res,
covariate = cov,
covar_assumption = "exchangeable")
As you may have guessed by now, we insert reg into the
function metareg_plot to obtain several results in
tabular and graphical format. We will glimpse into the
plots:
# Abbreviated interventions (in the order they appear in the dataset)
abbr_interv_names <- c("placebo", "BUDE", "BUDE+FORM", "FLUT", "FLUT+SALM", "FORM", "SALM", "TIOT")
# End-user-ready results for network meta-regression
metareg_plot(full = res,
reg = reg,
compar = "BUDE",
cov_value = list(2002, "Publication year"),
drug_names = abbr_interv_names)
The 95% credible intervals (black lines) overlap the 95% prediction
intervals (colored lines) in plot A). The odds ratios
refer to comparisons with budesidone, the selected comparator
intervention (compar = "BUDE"). In both plots, the
interventions are sorted from best to worst based on their SUCRA value
(see plot B)).
The hierarchy of the interventions does not materially change when comparing the two models regarding the SUCRA values for the publication year 2002.
The main functionalities of rnmamod have been presented in this tutorial. The interested reader is prompted to browse through the manual to dive in the technical details of the functions.
For any questions about the tutorial and the rnmamod R package, feel free to contact Spineli.Loukia@mh-hannover.de. For any information about the author, visit loukiaspineli.netlify.app.